Iterative Noise Injection for Scalable Imitation Learning

Authors

  • Michael Laskey
  • Jonathan Lee
  • Wesley Yu-Shu Hsieh
  • Richard Liaw
  • Jeffrey Mahler
  • Roy Fox
  • Kenneth Y. Goldberg
Abstract

One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy. A known problem with this “off-policy” approach is that the robot’s errors compound when drifting away from the supervisor’s demonstrations. On-policy techniques alleviate this by iteratively collecting corrective actions for the current robot policy. However, these techniques can be tedious for human supervisors, add significant computational burden, and may visit dangerous states during training. We propose an off-policy approach that injects noise into the supervisor’s policy while demonstrating. This forces the supervisor to demonstrate how to recover from errors. We propose a new algorithm, DART (Disturbances for Augmenting Robot Trajectories), that collects demonstrations with injected noise, and optimizes the noise level to approximate the error of the robot’s trained policy during data collection. We compare DART with DAgger and Behavior Cloning in two domains: in simulation with an algorithmic supervisor on the MuJoCo tasks (Walker, Humanoid, Hopper, Half-Cheetah) and in physical experiments with human supervisors training a Toyota HSR robot to perform grasping in clutter. For high-dimensional tasks like Humanoid, DART can be up to 3x faster in computation time and only decreases the supervisor’s cumulative reward by 5% during training, whereas DAgger executes policies that have 80% less cumulative reward than the supervisor. On the grasping in clutter task, DART obtains on average a 62% performance increase over Behavior Cloning.
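The noise-injection idea described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 1-D dynamics in `step`, the saturated proportional `supervisor`, the linear learner in `fit_linear_policy`, and the use of root-mean-square disagreement in `estimate_sigma` as a stand-in for the paper's noise-level optimization. The sketch only shows the general pattern: noise perturbs the executed action, the recorded label stays the supervisor's intended action, and the noise level is re-estimated from the trained policy's error after each round.

```python
import random

def supervisor(state):
    """Hypothetical supervisor: proportional control toward 0, saturated at +/-0.3."""
    return max(-0.3, min(0.3, -0.5 * state))

def step(state, action):
    """Hypothetical 1-D dynamics: the state moves by the executed action."""
    return state + action

def collect_noisy_demo(sigma, horizon=20, start=1.0, rng=None):
    """Roll out the supervisor with Gaussian noise added to the executed action.

    The stored label is the supervisor's intended (noise-free) action, so the
    data shows how to recover from the perturbed states.
    """
    if rng is None:
        rng = random.Random(0)
    state, data = start, []
    for _ in range(horizon):
        intended = supervisor(state)
        data.append((state, intended))
        state = step(state, intended + rng.gauss(0.0, sigma))
    return data

def fit_linear_policy(data):
    """Least-squares gain k for the assumed policy class a = k * s."""
    num = sum(s * a for s, a in data)
    den = sum(s * s for s, _ in data)
    return num / den

def estimate_sigma(data, gain):
    """Assumed stand-in: RMS disagreement between the learner and the labels."""
    squared = [(gain * s - a) ** 2 for s, a in data]
    return (sum(squared) / len(squared)) ** 0.5

def noise_injection_loop(iterations=3, sigma=0.1, seed=0):
    """Alternate between noisy data collection, fitting, and re-estimating sigma."""
    rng = random.Random(seed)
    data, history = [], []
    for _ in range(iterations):
        data += collect_noisy_demo(sigma, rng=rng)
        gain = fit_linear_policy(data)
        sigma = estimate_sigma(data, gain)
        history.append((gain, sigma))
    return data, history
```

Calling `noise_injection_loop()` returns the aggregated `(state, label)` pairs and one `(gain, sigma)` pair per iteration. Because the linear learner cannot represent the saturated supervisor exactly, the estimated noise level stays positive in this toy setting.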

Similar resources

UCL+Sheffield at SemEval-2016 Task 8: Imitation learning for AMR parsing with an alpha-bound

We develop a novel transition-based parsing algorithm for the abstract meaning representation parsing task using exact imitation learning, in which the parser learns a statistical model by imitating the actions of an expert on the training data. We then use the imitation learning algorithm DAGGER to improve the performance, and apply an α-bound as a simple noise reduction technique. Our perform...

A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

Sequential prediction problems such as imitation learning, where future observations depend on previous predictions (actions), violate the common i.i.d. assumptions made in statistical learning. This leads to poor performance in theory and often in practice. Some recent approaches (Daumé III et al., 2009; Ross and Bagnell, 2010) provide stronger guarantees in this setting, but remain somewhat u...

Iterative learning identification and control for dynamic systems described by NARMAX model

A new iterative learning controller is proposed for a general unknown discrete time-varying nonlinear non-affine system represented by NARMAX (Nonlinear Autoregressive Moving Average with eXogenous inputs) model. The proposed controller is composed of an iterative learning neural identifier and an iterative learning controller. Iterative learning control and iterative learning identification ar...

DART: Noise Injection for Robust Imitation Learning

One approach to Imitation Learning is Behavior Cloning, in which a robot observes a supervisor and infers a control policy. A known problem with this “off-policy” approach is that the robot’s errors compound when drifting away from the supervisor’s demonstrations. On-policy techniques alleviate this by iteratively collecting corrective actions for the current robot policy. However, these techn...

Noise reduction and targeted exploration in imitation learning for Abstract Meaning Representation parsing

Semantic parsers map natural language statements into meaning representations, and must abstract over syntactic phenomena, resolve anaphora, and identify word senses to eliminate ambiguous interpretations. Abstract meaning representation (AMR) is a recent example of one such semantic formalism which, similar to a dependency parse, utilizes a graph to represent relationships between concepts (Ba...

Journal:
  • CoRR

Volume: abs/1703.09327

Publication date: 2017